
    Visually-Grounded Language Model for Human-Robot Interaction

    Visually grounded human-robot interaction is recognized as an essential ingredient of socially intelligent robots, and the integration of vision and language increasingly attracts the attention of researchers in diverse fields. However, most systems lack the capability to adapt and expand themselves beyond a preprogrammed set of communicative behaviors, and their linguistic capabilities remain far from satisfactory, which makes them unsuitable for real-world applications. In this paper we present a system in which a robotic agent learns a grounded language model by actively interacting with a human user. The model is grounded in the sense that the meaning of words is linked to the agent's concrete sensorimotor experience, and linguistic rules are automatically extracted from the interaction data. The system has been tested on the NAO humanoid robot, where it has been used to understand and generate appropriate natural language descriptions of real objects. The system is also capable of conducting a verbal interaction with a human partner in potentially ambiguous situations.
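
    To make the notion of a grounded lexicon concrete, the following Python sketch shows one minimal way a word could be linked to sensorimotor experience: each word accumulates the feature vectors observed when it was used, and description generation returns the word whose prototype best matches a new percept. The GroundedLexicon class, the toy colour features and the nearest-prototype rule are illustrative assumptions, not the model described in the paper.

import numpy as np

class GroundedLexicon:
    def __init__(self):
        self.sums = {}    # word -> running sum of observed feature vectors
        self.counts = {}  # word -> number of observations

    def observe(self, word, features):
        # Link one sensorimotor observation to the word.
        f = np.asarray(features, dtype=float)
        self.sums[word] = self.sums.get(word, np.zeros_like(f)) + f
        self.counts[word] = self.counts.get(word, 0) + 1

    def prototype(self, word):
        # Mean feature vector associated with the word so far.
        return self.sums[word] / self.counts[word]

    def describe(self, features):
        # Generate the word whose prototype is closest to the observed features.
        f = np.asarray(features, dtype=float)
        return min(self.sums, key=lambda w: np.linalg.norm(self.prototype(w) - f))

lex = GroundedLexicon()
lex.observe("red",  [0.9, 0.1, 0.1])    # toy colour features, e.g. from the camera
lex.observe("blue", [0.1, 0.1, 0.9])
print(lex.describe([0.85, 0.15, 0.1]))  # -> red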

    BAYESIAN APPROACHES TO HUMAN-ROBOT INTERACTION: FROM LANGUAGE GROUNDING TO ACTION LEARNING AND UNDERSTANDING

    In the field of human-robot interaction, the robot is no longer considered a tool but a partner that supports the work of humans. Environments featuring the interaction and collaboration of humans and robots present a number of challenges involving robot learning and interactive capabilities. In order to operate in these environments, the robot must not only be able to act, but also be able to interact and, especially, to "understand". This thesis proposes a unified probabilistic framework that allows a robot to develop basic cognitive skills essential for collaboration. To this aim we embrace the idea of motor simulation, well established in cognitive science and neuroscience, in which the robot re-enacts in simulation its own internal models used for physically performing actions. This view offers the possibility of unifying apparently distinct cognitive phenomena such as learning, interaction, understanding and dialogue, to name a few. The ideas presented here are corroborated by experimental results obtained both in simulation and on a humanoid robotic platform. The first contribution in this direction is a robust Bayesian method to estimate (i.e. learn) the parameters of internal models by observing other skilled actors performing goal-directed actions. In addition to deriving a theoretically sound solution to the learning problem, our approach establishes theoretical links between Bayesian inference and gradient-based optimization methods. Using the expectation propagation (EP) algorithm, a similar algorithm is derived for the multiple-internal-models scenario. Once learned, internal models are reused in simulation to "understand" actions performed by other actors, which is a necessary precondition for successful interaction. We propose that action understanding can be cast as approximate Bayesian inference in which the covert activity of internal models produces hypotheses that are tested in parallel through a sequential Monte Carlo approach. Here, approximate Bayesian inference is offered as a plausible mechanistic implementation of the idea of motor simulation, making it feasible in real time and with limited resources. Finally, we investigate how the robot can learn a grounded language model in order to be bootstrapped into communication. Features extracted from the learned internal models, as well as descriptors of various perceptual categories, are fed into a novel multi-instance semi-supervised learning algorithm able to perform semantic clustering and associate words, either nouns or verbs, with their grounded meanings.
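
    The action-understanding idea, hypotheses generated by internal models and tested in parallel against incoming observations, can be illustrated with a much-simplified Python sketch. The two hand-written forward models (reach_model, retract_model), the scalar hand position and the Gaussian likelihood with a fixed noise level are assumptions made for illustration; the thesis uses learned internal models and a full sequential Monte Carlo scheme.

import numpy as np

def reach_model(state):
    # Hypothetical internal model: the hand moves towards a target at 1.0.
    return state + 0.1 * (1.0 - state)

def retract_model(state):
    # Hypothetical internal model: the hand moves back towards rest at 0.0.
    return state + 0.1 * (0.0 - state)

hypotheses = {"reach": reach_model, "retract": retract_model}
weights = {name: 1.0 / len(hypotheses) for name in hypotheses}
sigma = 0.05   # assumed observation noise
state = 0.5    # last observed hand position (1-D for simplicity)
observations = [0.55, 0.60, 0.64, 0.67]  # a movement consistent with reaching

for obs in observations:
    for name, model in hypotheses.items():
        prediction = model(state)                        # covert simulation step
        likelihood = np.exp(-0.5 * ((obs - prediction) / sigma) ** 2)
        weights[name] *= likelihood                      # reweight the hypothesis
    total = sum(weights.values())
    weights = {name: w / total for name, w in weights.items()}
    state = obs

print(weights)  # the "reach" hypothesis ends up with almost all the weight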

    Resolving ambiguities in a grounded human-robot interaction

    In this paper we propose a trainable system that learns grounded language models from examples with a minimum of user intervention and without feedback. We have focused on the acquisition of grounded meanings of spatial and adjective/noun terms. The system has been used to understand, and subsequently to generate, appropriate natural language descriptions of real objects and to engage in verbal interactions with a human partner. We have also addressed the problem of resolving possible ambiguities arising during verbal interaction through an information-theoretic approach.
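
    One assumed reading of the information-theoretic disambiguation step is sketched below: when several objects match a description, the system asks about the attribute whose answer is expected to leave the least residual uncertainty over the candidate referents. The candidate set, the attribute names and the expected_remaining_entropy helper are hypothetical; the paper's actual criterion may differ.

import math
from collections import Counter

# Objects that all match an ambiguous description such as "the small one".
candidates = [
    {"colour": "red",   "size": "small"},
    {"colour": "blue",  "size": "small"},
    {"colour": "green", "size": "small"},
]

def expected_remaining_entropy(attribute):
    # Average uncertainty (in bits) left over the referent after hearing the
    # value of the chosen attribute: an answer shared by k candidates still
    # leaves log2(k) bits of uncertainty.
    groups = Counter(obj[attribute] for obj in candidates)
    n = len(candidates)
    return sum((k / n) * math.log2(k) for k in groups.values())

question = min(("colour", "size"), key=expected_remaining_entropy)
print(question)  # -> colour: it separates all candidates, "size" separates none

    Minimising the expected remaining entropy in this way is equivalent to choosing the most informative clarification question under a uniform prior over the candidate referents.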

    A Probabilistic Approach to Learning a Visually Grounded Language Model through Human-Robot Interaction

    Language is among the most fascinating and complex cognitive activities, and it develops rapidly from the early months of an infant's life. The aim of the present work is to provide a humanoid robot with the cognitive, perceptual and motor skills fundamental for the acquisition of a rudimentary form of language. We present a novel probabilistic model, inspired by findings in the cognitive sciences, able to associate spoken words with their perceptually grounded meanings. The main focus is on acquiring the meaning of various perceptual categories (e.g. red, blue, circle, above), rather than specific world entities (e.g. an apple, a toy). Our probabilistic model is based on a variant of the multi-instance learning technique, and it enables a robotic platform to learn grounded meanings of adjective/noun terms. The system can be used to understand and generate appropriate natural language descriptions of real objects in a scene, and it has been successfully tested on the NAO humanoid robotic platform.
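
    The multi-instance flavour of the learning problem, in which each utterance is paired with a whole scene rather than with a single labelled object, can be illustrated with a naive cross-situational counting sketch. The episodes list, the symbolic perceptual categories and the simple co-occurrence rule are stand-ins chosen for illustration; the paper's algorithm operates on continuous perceptual features and is considerably more sophisticated.

from collections import defaultdict

# Each training episode: the words heard, and the categories present in the
# scene (two objects per scene, each described by a colour and a shape).
episodes = [
    (["red"],  [{"red", "ball"},   {"blue", "box"}]),
    (["red"],  [{"red", "box"},    {"green", "cup"}]),
    (["red"],  [{"red", "cup"},    {"blue", "ball"}]),
    (["ball"], [{"green", "ball"}, {"red", "box"}]),
    (["ball"], [{"blue", "ball"},  {"green", "cup"}]),
    (["ball"], [{"red", "ball"},   {"blue", "box"}]),
]

# Count how often every perceptual category co-occurs with every word.
cooccurrence = defaultdict(lambda: defaultdict(int))
for words, scene in episodes:
    for word in words:
        for obj in scene:
            for category in obj:
                cooccurrence[word][category] += 1

# A word's meaning is taken to be the category it co-occurs with most often.
meanings = {word: max(counts, key=counts.get) for word, counts in cooccurrence.items()}
print(meanings)  # -> {'red': 'red', 'ball': 'ball'}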